Stacked Ensembles of Information Extractors for Knowledge-Base Population by Combining Supervised and Unsupervised Approaches

Authors

  • Nazneen Fatema Rajani
  • Raymond J. Mooney
Abstract

The UTAustin team participated in two main tasks this year: the Cold Start Slot Filling (CSSF) task and the Slot-Filler Validation/Ensembling task, which was divided into filtering and ensembling subtasks. Our system uses stacking to ensemble multiple systems for the KBP slot filling task, as described in our ACL 2015 paper. We extend the stacking approach by allowing the classifier to also use additional features that are relevant to making a final decision. Stacking relies on supervised training and hence requires that systems common to the 2014 data be used for training. However, that approach limits performance, so we propose a novel approach that combines the supervised approach with an unsupervised approach applied to the remaining systems. We believe this combination gives our best run for the ensembling task. In this paper, we also discuss strategies for handling Cold Start data that comes from multiple hops.
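The combination described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the abstract does not say how the unsupervised component works or how the two are combined, so this sketch uses a plain mean of confidences as a stand-in unsupervised score and feeds it to a small logistic-regression stacker as one extra feature. The function names (`train_stacker`, `unsupervised_score`, `build_features`, `predict`) and the toy confidences are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_stacker(rows, labels, epochs=2000, lr=0.5):
    """Fit a logistic-regression meta-classifier with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def unsupervised_score(confidences):
    """Stand-in unsupervised aggregate (assumption): mean confidence of the
    systems that have no prior-year training data."""
    return sum(confidences) / len(confidences) if confidences else 0.0

def build_features(common_conf, other_conf):
    """Per-system confidences for the common systems, plus one extra feature
    holding the unsupervised aggregate over the remaining systems."""
    return list(common_conf) + [unsupervised_score(other_conf)]

def predict(weights, bias, features):
    """Probability that a candidate slot fill is correct."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, features)))

# Toy training data (hypothetical): confidences from two common systems,
# confidences from two remaining systems, and a correctness label.
# A confidence of 0.0 means the system did not produce the fill.
train = [
    ((0.9, 0.8), (0.7, 0.9), 1),
    ((0.8, 0.0), (0.6, 0.8), 1),
    ((0.1, 0.2), (0.1, 0.0), 0),
    ((0.0, 0.3), (0.2, 0.1), 0),
    ((0.7, 0.9), (0.8, 0.6), 1),
    ((0.2, 0.0), (0.0, 0.2), 0),
]
rows = [build_features(common, other) for common, other, _ in train]
labels = [label for _, _, label in train]
weights, bias = train_stacker(rows, labels)

# Score an unseen candidate fill and keep it if the stacker is confident.
candidate = build_features((0.85, 0.9), (0.8, 0.9))
accept = predict(weights, bias, candidate) > 0.5
```

Passing the unsupervised aggregate in as a feature is only one way to combine the two signals; a real system would also need the additional features the abstract mentions, which are not specified here.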

Similar resources

Knowledge Base Population using Stacked Ensembles of Information Extractors

Extreme Extraction: Only One Hour per Relation

Information Extraction (IE) aims to automatically generate a large knowledge base from natural language text, but progress remains slow. Supervised learning requires copious human annotation, while unsupervised and weakly supervised approaches do not deliver competitive accuracy. As a result, most fielded applications of IE, as well as the leading TAC-KBP systems, rely on significant amounts of...

Stacked Ensembles of Information Extractors for Knowledge-Base Population

We present results on using stacking to ensemble multiple systems for the Knowledge Base Population English Slot Filling (KBP-ESF) task. In addition to using the output and confidence of each system as input to the stacked classifier, we also use features capturing how well the systems agree about the provenance of the information they extract. We demonstrate that our stacking approach outperfo...

Evaluating Unsupervised Ensembles when applied to Word Sense Induction

Ensembles combine knowledge from distinct machine learning approaches into a general flexible system. While supervised ensembles frequently show great benefit, unsupervised ensembles prove to be more challenging. We propose evaluating various unsupervised ensembles when applied to the unsupervised task of Word Sense Induction with a framework for combining diverse feature spaces and clustering ...

Consensus Maximization Fusion of Probabilistic Information Extractors

Current approaches to Information Extraction (IE) are capable of extracting large amounts of facts with associated probabilities. Because no current IE system is perfect, complementary and conflicting facts are obtained when different systems are run over the same data. Knowledge Fusion (KF) is the problem of aggregating facts from different extractors. Existing methods approach KF using superv...

Publication date: 2015